[SPARK-57103][FOLLOWUP][SQL][TESTS] Add test coverage for MIN/MAX over nanosecond-precision timestamp types by stevomitric · Pull Request #56592 · apache/spark

stevomitric · 2026-06-18T10:53:43Z

What changes were proposed in this pull request?

This PR adds end-to-end, Catalyst-level, and golden-file test coverage for the MIN and MAX aggregates (and the closely related min_by / max_by / greatest / least) over the nanosecond-precision timestamp types TIMESTAMP_NTZ(p) / TIMESTAMP_LTZ(p) (p ∈ [7, 9]), part of the nanosecond timestamp preview (SPARK-56822).

No production code is changed. Max/Min are type-agnostic DeclarativeAggregates whose only type gate is TypeUtils.checkForOrderingExpr, which already returns TypeCheckSuccess for the nanosecond types because they are orderable AtomicTypes Comparable were wired in SPARK-57103, the physical value is UnsafeRow-mutable (SPARK-56981), and CodeGenerator.genComp already has a dedicated case _: AnyTimestampNanoType => compareTo arm. Since dataType = child.dataType, the result preserves the input precision. This mirrors the TimeType precedent (SPARK-52626 / SPARK-52660), where becoming orderable made MIN/MAX work and the remaining work was test coverage.

Added tests:

TimestampNanosFunctionsSuiteBase (runs under ANSI on and off): precision/type preservation; sub-microsecond ordering on both the codegen and interpreted paths (two values sharing the same epochMicros but differing within the microsecond); all-NULL and empty input → NULL; GROUP BY a nanosecond key whose sub-microsecond remainder must keep distinct keys from collapsing; min_by/max_by/greatest/least over same-precision columns; and equivalence with the microsecond path when the sub-microsecond digits are zero.
ExpressionTypeCheckingSuite: asserts Max/Min checkInputDataTypes() succeeds for TimestampNTZNanosType / TimestampLTZNanosType and that the result type preserves the precision.
timestamp-ntz-nanos.sql / timestamp-ltz-nanos.sql: MAX/MIN and GROUP BY queries, with regenerated results/ and analyzer-results/ golden files.

Mixed-precision inputs (e.g. greatest(ts_p7, ts_p9), a UNION of two precisions, or comparing an aggregate against a micros literal) route through findWiderDateTimeType, which has no nanosecond arm yet, so they are intentionally out of scope here and tracked by SPARK-57454; every column in these tests is strictly same-precision.

Why are the changes needed?

MIN/MAX over nanosecond timestamps are a core, expected operation, but the nanosecond preview had no aggregate test coverage.

Does this PR introduce any user-facing change?

No. Tests only;

How was this patch tested?

New tests in this PR.

Was this patch authored or co-authored using generative AI tooling?

Generated-by: Cursor

MaxGekk

0 blocking, 1 non-blocking, 1 nit.
Clean, well-targeted tests-only PR. I verified the production claims: Max/Min gate on orderability only and dataType = child.dataType, and findWiderDateTimeType genuinely has no nanos arm, so mixed-precision is correctly out of scope (SPARK-57454). Test data is meaningful (sub-microsecond-distinct values would catch micros-truncation; the group-by test asserts no-collapse). Both findings are about test placement / a stale suite doc, neither blocks merge.

Design / architecture (1)

TimestampNanosFunctionsSuiteBase.scala:243: MIN/MAX aggregate tests live in a functions-scoped suite, whereas the cited TimeType analogue (SPARK-52626/52660) placed its aggregate tests in DataFrameAggregateSuite — see inline

Nits: 1 minor item (stale class Scaladoc, folded into the same inline comment).

MaxGekk · 2026-06-18T11:29:04Z

    }
  }
+
+  // ===== MIN / MAX aggregates over nanosecond-precision timestamps (SPARK-56822) =====


These are MIN/MAX aggregate tests, but they land in a suite whose class Scaladoc (lines 28-33) scopes it to "the hour, minute and second functions". The TimeType precedent this PR explicitly mirrors put its analogous aggregate tests in DataFrameAggregateSuite (SPARK-52626 group-by at :3713, SPARK-52660 codegen-split max/min at :3724) — the dedicated aggregate suite. Consider co-locating these there (or in a dedicated nanos-aggregate suite) so aggregate-over-temporal-type coverage stays discoverable in one place.

There's a legitimate pull toward keeping them here — this suite gives the ANSI on/off matrix, the nanos preview-flag session, and the fixed session timezone for free — so this is a judgment call, not a defect.

Minor, regardless of where they end up: the class Scaladoc is now stale. It still says the suite tests only hour/minute/second and that "Each test exercises both the SQL path (selectExpr) and the Scala Column API (functions.hour / minute / second)" — no longer true now that aggregate tests are present (several use only selectExpr). Please update it (or move the tests).

Kept them here for the free ANSI on/off matrix, preview-flag session, and fixed timezone. Updated the class Scaladoc.

…N/MAX aggregate tests Addresses review feedback on apache#56592: the class Scaladoc for TimestampNanosFunctionsSuiteBase still described only the hour/minute/second functions and claimed every test exercises both the SQL and Column API paths, no longer accurate now that the MIN/MAX aggregate tests are present. Rewrote it to cover the datetime functions and the MIN/MAX aggregates and to note the ANSI on/off subclasses. Tests-only; no production change. Co-authored-by: Isaac

stevomitric · 2026-06-18T13:13:08Z

cc @MaxGekk PTAL.

MaxGekk · 2026-06-18T16:04:30Z

+1, LGTM. Merging to master/4.x.
Thank you, @stevomitric.

MaxGekk · 2026-06-18T16:06:22Z

@stevomitric SPARK-57502 is wrong. Please, set correct one.

…AX nanos tests The MIN/MAX-over-nanos test names and golden-file comments were tagged with a placeholder ticket, SPARK-57502, which is in fact an unrelated INFRA issue ("Fix npm vulnerabilities in ui-test and dev"). Rename the references to SPARK-57103 ("Add ordering, compare, and hash for nanosecond timestamp types"), the sub-task of SPARK-56822 that enabled MIN/MAX over the nanosecond types. Tests-only; no production or behavior change. Co-authored-by: Isaac

stevomitric · 2026-06-18T17:44:36Z

@stevomitric SPARK-57502 is wrong. Please, set correct one.

You are right. I set the SPARK-57103 which introduced comparisons which enabled the min/max expressions.

MaxGekk · 2026-06-18T20:38:47Z

@stevomitric Could you resolve conflicts, please.

Resolve conflicts against SPARK-57528 (unix_timestamp over nanos), which edited the same nanosecond-timestamp test files: - TimestampNanosFunctionsSuiteBase.scala: keep both the SPARK-57103 MIN/MAX aggregate tests and the SPARK-57528 unix_timestamp tests. - timestamp-{ntz,ltz}-nanos.sql: keep both query sections. - results/ and analyzer-results/ golden files regenerated. Co-authored-by: Isaac

uros-b

Thank you @stevomitric and @MaxGekk!

MaxGekk · 2026-06-19T10:47:11Z

All tests passed https://github.com/stevomitric/spark/runs/82312660388

+1, LGTM. Merging to master/4.x.
Thank you, @stevomitric and @uros-b for review.

… nanosecond-precision timestamp types ### What changes were proposed in this pull request? This PR adds end-to-end, Catalyst-level, and golden-file test coverage for the `MIN` and `MAX` aggregates (and the closely related `min_by` / `max_by` / `greatest` / `least`) over the nanosecond-precision timestamp types `TIMESTAMP_NTZ(p)` / `TIMESTAMP_LTZ(p)` (`p ∈ [7, 9]`), part of the nanosecond timestamp preview (SPARK-56822). No production code is changed. `Max`/`Min` are type-agnostic `DeclarativeAggregate`s whose only type gate is `TypeUtils.checkForOrderingExpr`, which already returns `TypeCheckSuccess` for the nanosecond types because they are orderable `AtomicType`s `Comparable` were wired in SPARK-57103, the physical value is UnsafeRow-mutable (SPARK-56981), and `CodeGenerator.genComp` already has a dedicated `case _: AnyTimestampNanoType => compareTo` arm. Since `dataType = child.dataType`, the result preserves the input precision. This mirrors the TimeType precedent (SPARK-52626 / SPARK-52660), where becoming orderable made `MIN`/`MAX` work and the remaining work was test coverage. Added tests: - `TimestampNanosFunctionsSuiteBase` (runs under ANSI on and off): precision/type preservation; sub-microsecond ordering on both the codegen and interpreted paths (two values sharing the same `epochMicros` but differing within the microsecond); all-NULL and empty input → NULL; `GROUP BY` a nanosecond key whose sub-microsecond remainder must keep distinct keys from collapsing; `min_by`/`max_by`/`greatest`/`least` over same-precision columns; and equivalence with the microsecond path when the sub-microsecond digits are zero. - `ExpressionTypeCheckingSuite`: asserts `Max`/`Min` `checkInputDataTypes()` succeeds for `TimestampNTZNanosType` / `TimestampLTZNanosType` and that the result type preserves the precision. - `timestamp-ntz-nanos.sql` / `timestamp-ltz-nanos.sql`: `MAX`/`MIN` and `GROUP BY` queries, with regenerated `results/` and `analyzer-results/` golden files. Mixed-precision inputs (e.g. `greatest(ts_p7, ts_p9)`, a `UNION` of two precisions, or comparing an aggregate against a micros literal) route through `findWiderDateTimeType`, which has no nanosecond arm yet, so they are intentionally **out of scope** here and tracked by SPARK-57454; every column in these tests is strictly same-precision. ### Why are the changes needed? `MIN`/`MAX` over nanosecond timestamps are a core, expected operation, but the nanosecond preview had no aggregate test coverage. ### Does this PR introduce any user-facing change? No. Tests only; ### How was this patch tested? New tests in this PR. ### Was this patch authored or co-authored using generative AI tooling? Generated-by: Cursor Closes #56592 from stevomitric/stevomitric/min-max. Authored-by: Stevo Mitric <stevomitric2000@gmail.com> Signed-off-by: Max Gekk <max.gekk@gmail.com> (cherry picked from commit 0d36e25) Signed-off-by: Max Gekk <max.gekk@gmail.com>

initial commit

1d28923

MaxGekk reviewed Jun 18, 2026

View reviewed changes

MaxGekk approved these changes Jun 18, 2026

View reviewed changes

stevomitric changed the title ~~[SPARK-57502][SQL][TESTS] Add test coverage for MIN/MAX over nanosecond-precision timestamp types~~ [SPARK-57103][FOLLOWUP][SQL][TESTS] Add test coverage for MIN/MAX over nanosecond-precision timestamp types Jun 18, 2026

uros-b approved these changes Jun 19, 2026

View reviewed changes

MaxGekk closed this in 0d36e25 Jun 19, 2026

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[SPARK-57103][FOLLOWUP][SQL][TESTS] Add test coverage for MIN/MAX over nanosecond-precision timestamp types#56592

[SPARK-57103][FOLLOWUP][SQL][TESTS] Add test coverage for MIN/MAX over nanosecond-precision timestamp types#56592
stevomitric wants to merge 4 commits into
apache:masterfrom
stevomitric:stevomitric/min-max

stevomitric commented Jun 18, 2026

Uh oh!

MaxGekk left a comment

Uh oh!

MaxGekk Jun 18, 2026

Uh oh!

stevomitric Jun 18, 2026

Uh oh!

stevomitric commented Jun 18, 2026

Uh oh!

MaxGekk commented Jun 18, 2026

Uh oh!

MaxGekk commented Jun 18, 2026

Uh oh!

stevomitric commented Jun 18, 2026

Uh oh!

MaxGekk commented Jun 18, 2026

Uh oh!

uros-b left a comment

Uh oh!

MaxGekk commented Jun 19, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Conversation

stevomitric commented Jun 18, 2026

What changes were proposed in this pull request?

Why are the changes needed?

Does this PR introduce any user-facing change?

How was this patch tested?

Was this patch authored or co-authored using generative AI tooling?

Uh oh!

MaxGekk left a comment

Choose a reason for hiding this comment

Design / architecture (1)

Uh oh!

MaxGekk Jun 18, 2026

Choose a reason for hiding this comment

Uh oh!

stevomitric Jun 18, 2026

Choose a reason for hiding this comment

Uh oh!

stevomitric commented Jun 18, 2026

Uh oh!

MaxGekk commented Jun 18, 2026

Uh oh!

MaxGekk commented Jun 18, 2026

Uh oh!

stevomitric commented Jun 18, 2026

Uh oh!

MaxGekk commented Jun 18, 2026

Uh oh!

uros-b left a comment

Choose a reason for hiding this comment

Uh oh!

MaxGekk commented Jun 19, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants